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In statistical data analysis, penalized regression is considered an attractive approach for its 
ability of simultaneous variable selection and parameter estimation. Although penalized 
regression methods have shown many advantages in variable selection and outcome 
prediction over other approaches for high-dimensional data, there is a relative paucity 
of the literature on their applications to hypothesis testing, e.g., in genetic association 
analysis. In this study, we apply several new penalized regression methods with a novel 
penalty, called Truncated /_i -penalty (TLP) (Shen et al., 2012), for either variable selection, 
or both variable selection and parameter grouping, in a data-adaptive way to test for 
association between a quantitative trait and a group of rare variants. The performance 
of the new methods are compared with some existing tests, including some recently 
proposed global tests and penalized regression-based methods, via simulations and an 
application to the real sequence data of the Genetic Analysis Workshop 17 (GAW17). 
Although our proposed penalized methods can improve over some existing penalized 
methods, often they do not outperform some existing global association tests. Some 
possible problems with utilizing penalized regression methods in genetic hypothesis 
testing are discussed. Given the capability of penalized regression in selecting causal 
variants and its sometimes promising performance, further studies are warranted. 
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1. INTRODUCTION 

Genome-wide association studies (GWAS) have uncovered many 
common variants (CVs) associated with complex diseases, but 
the proportion of variance explained by the identified CVs is 
often low (Maher, 2008). With the recent advance of sequencing 
technologies, analysis of rare variants (RVs) has become a feasi- 
ble alternative. Recent studies have demonstrated that some RVs 
are associated with complex disease. For example, Kotowski et al. 
(2006) found that multiple RVs in gene PCSK9 are associated with 
plasma levels of low-density lipoprotein cholesterol. 

In this study, we propose applying some new penalized regres- 
sion methods to test for association between a quantitative trait 
and multiple RVs. Differing from the usual application of penal- 
ized regression methods to variable selection or risk prediction 
for high-dimensional data (Kooperberg et al, 2010; Austin et al., 
2013), here we focus on their application to hypothesis testing 
on a quantitative trait in a relatively low-dimensional setting. 
In such a setting, one commonly used statistical test is the 
_F-test in linear regression. For example, in simple regression, 
a trait Y is regressed on each of multiple variants sequentially. 
However, because of the extremely low minor allele frequency 
(MAF) of a RV, a test to detect the association between a trait 
and a single RV might be low powered. Also, this approach 
may be too conservative due to a stringent control for mul- 
tiple testing, e.g., by the Bonferroni correction to control the 
family-wise error rate. In addition, ultimately, complex diseases 



are expected to be affected by a combination of multiple genetic 
variants. Thus an analysis in which a group of variants are 
tested simultaneously for their joint effects on the trait may 
be more powerful. In multiple regression, to assess any asso- 
ciation between a trait and k RVs, all k RVs are added to a 
regression model. However, as k increases, the statistical power 
might decrease due to the cost of large degrees of freedom (DF), 
k. To avoid the large DF and to aggregate information across 
multiple RVs, one common strategy is to pool or collapse mul- 
tiple RVs in a region or gene (Li and Leal, 2008; Madsen and 
Browning, 2009). One such attempt is the Sum test (Pan, 2009), 
which was developed to utilize joint effects of multiple variants 
while reducing the DF. With only 1 DF, the Sum test enhances 
power under some scenarios (Chapman and Whittaker, 2008; 
Pan, 2009). However, it is noted that the performance of the 
Sum test depends on the directions of the variants' associations 
with a trait. Thus, in an extreme case where a half of the vari- 
ants are positively associated with the trait and the other half 
are negatively associated with similar effect sizes, the positive 
and negative effects may cancel out, leading to the poor perfor- 
mance of the Sum test and other burden tests (Han and Pan, 
2010; Li et al, 2010). In addition, in the Sum test or other 
pooling-based burden tests, combining or collapsing all variants 
into just one group ignores the variants' possibly varying effect 
sizes, and thus may not work well in those situations. In par- 
ticular, the Sum test and many burden tests perform poorly if 
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many null (i.e., non-associated) RVs are present (Basu and Pan, 
2011). Consequently, the Sum test and other pooled association 
tests might be low powered. 

On the other hand, to deal with high-dimensional genetic and 
genomic data, penalized regression methods have received much 
attention, especially those based on the Lasso penalty (Tibshirani, 
1996; Kooperberg et al., 2010). Penalized regression has been 
considered attractive for its potential of simultaneous variable 
selection and parameter estimation. In particular, several authors 
have studied the performance of penalized regression in genetic 
association analysis (Guo and Lin, 2009; Tzeng and Bondell, 2010; 
Zhou et al., 2011). However, the penalties used therein are typi- 
cally based on the Lasso, which is known to give biased parameter 
estimates and possibly inconsistent variable selection. In contrast, 
one of the very recently developed state-of-the-art penalties, the 
truncated Li-penalty (TLP) (Shen et al., 2012), overcomes the 
above shortcomings of Lasso. The TLP approximates the Lq-Ioss 
and reduces the bias of a parameter estimate from the popular 
Lasso or Li -penalty. To investigate whether an application of TLP 
would boost statistical power in genetic association testing, in 
this study we apply the TLP for variable selection, denoted TLP- 
S, and for both variable selection and parameter grouping (Zhu 
et al., 2013), denoted TLP-SG, in a data-adaptive way, to select 
and group variants to reduce the DF as in the Sum test, while 
reducing the downward bias of the parameter estimates based 
on an Li-type penalty. We compare the TLP-S and TLP-SG to 
the Lasso and graph-fused Lasso (gflasso) (Kim and Xing, 2009). 
The gflasso also pursues parameter grouping with an Li -penalty. 
Specifically, the gflasso shrinks two variants' effect sizes toward 
each other by penalizing their difference — r(j,/)jSj/|, where 
either r(j,/) = 1 (called gflassOr=i) or r(j,j') is the sign of the 
correlation between the two variants j and/ (called flasso,-= cor)- 
There are two main differences between our proposed TLP-SG 
and gflasso. First, TLP-SG shrinks the absolute values of the two 
parameters toward each other by penalizing \ — I /S/ 1 1 • In this 
way, it desirably allows two variants to have similar effect sizes 
but opposite association directions. However, such a penalty is 
non-convex and thus computationally more challenging. Second, 
by the use of TLP-based grouping (see details later), TLP-SG 
shrinks \Pj\ and toward each other only if their difference 
is relatively small (as compared to a tuning parameter to be 
decided) , thus, for example, avoiding severely biasing the estimate 
of the effect size of an associated variant toward 0 by shrink- 
ing it toward the null effect of a null variant. We note that, 
although penalized regression methods have been widely used 
and studied, their applications to the current context with RVs 
are much more limited; in particular, we are not aware of any 
applications of TLP-S, TLP-SG and gflasso to association testing 
of RVs. 

This paper is organized in four sections. Section 2 provides a 
brief review of some existing association tests to be compared, 
and then introduces our proposed TLP-based tests. In section 
3, we compare the performance of the methods with simulated 
data and with an application to the Genetic Analysis Workshop 
17 (GAW17) sequence data (Almasy et al, 2011). Finally, the 
Discussion section summarizes the results, and suggests some 
potential problems for future study. 



2. METHODS 

2.1. SOME EXISTING ASSOCIATION TESTS 

We briefly review some existing global tests based on the ordinary 
least squares (OLS) estimates. Given n independent observations 
{Yi, Xi), i = I, . . . , n, with 7,- as a quantitative trait and a vec- 
tor Xi = (X,i , . . . , Xik) as genotypes of k variants for subject i, we 
would like to test for any possible association between the trait 
and genotypes. We use the dosage coding for Xy-. Xy = 0, 1, or 2, 
representing the count number of one of the two alleles present 
in variant; of subject i. A multi-locus association analysis is based 
on fitting a linear model, 

k 

; = i 

where the errors 6, are assumed to be independently drawn 
from N{0, a^), a Normal distribution with mean 0 and vari- 
ance cr^. A global test of any possible association between the 
trait and k variants can be formulated as testing on the mul- 
tiple parameters fijS for = 1, . . . , fc with null hypothesis Hq : 
P = (Pi, . . . , Pi^y = 0 by an _F-test, which is based on the OLS 
estimates that minimize the residual sum of squares. A potential 
problem with the test is the power loss due to the large variance 
of f}j since the MAFs of RVs are small. 

We also apply other four association tests: the Score, the sum 
of squared score (SSU), its weighted version SSUw (Pan, 2009), 
and the univariate minP (UminP) tests. The Score test is popular 
in general statistics while the UminP test is most widely used for 
CVs in GWAS; on the other hand, Basu and Pan (2011) showed 
that the SSU and SSUw tests were powerful in RV association test- 
ing for case-control studies. Here, as a secondary contribution, 
we extend the SSU and SSUw tests to the case with a quantita- 
tive trait. All the four tests are based on the score vector U and its 
covariance matrix V under Hq: 

n 

i= 1 

n 

V = coviu) = - - 

i= 1 

where ? = E"= i i'./". ^ = E"= i and = Ef= i (i'. " 
Y)^ /{n — 1) is the estimate of cr^ under Hq. The corresponding 
four test statistics are 

Tssu = U'^U, 

Tssuw = VJ 1 U with Vd = Diag( V) , 
TummP = rnaxUf/Vj, 

where Uj is thejth element of U and Vj is the (j, j)th diagonal ele- 
ment of V. Under Hq, asymptotically Tscore has a xl distribution, 
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each of Tssu and Tssuw has a mixture of chi-squared distribu- 
tions (Pan, 2009), and the p-value of TjjminP can be numerically 
obtained (Conneely and Boehnke, 2007). 

Next, we extend the Sum test (Pan, 2009) and its modi- 
fied version, a data-adaptive Sum (aSum) test (Han and Pan, 
2010), to the case with a quantitative trait. The Sum test 
was originated to model multiple variants jointly while induc- 
ing a minimum number of DF: while including all the vari- 
ants in the linear model, it assumes that the variants all have 
the same effect size (and direction), /j^, as in the following 
model: 



(2) 



Fitting (Equation 2) is equivalent to conducting a simple regres- 
sion of y on a new covariate, the sum of the genotypes over the 
multiple variants. To address the question of whether any associa- 
tion between the disease and the variants exists, one simply needs 
to test Ho : ySc = 0, without the need for multiple testing adjust- 
ment. The main advantage of the Sum test is that, because it tests 
on only one parameter Pc> there will be no power loss due to the 
large DF. The common association parameter f)c is a weighted 
average of the individual /Jm,i, ■ ■ • , Pm.k in the marginal mod- 
els Y, = Pm,o + X,jPM.j + €,j for j=l,...,k (Pan, 2009). On 
the other hand, the main problem of the Sum test is its depen- 
dence on the signs of /SmjS or on the coding of each variant (i.e., 
which allele is chosen as the reference category). If the signs are 
not the same, the test may have a quite small fie and thus low 
power. To overcome the limitation of the Sum test, Han and Pan 
(2010) proposed the aSum test for a case-control study design, 
which can be equally applied to quantitative traits as the fol- 
lowing. (1) For each variant j, flip its coding to X* = 2 — Xj if 

f^M.j < 0 and its p-value pM,; < in the marginal model; other- 
wise use the same codingX* = Xj. (2) Fit the model (Equation 2) 
with the new coding X*. To test Hq in the aSum test, we use 
a permutation-based log-likelihood ratio test (LRT), which is 
asymptotically equivalent to the score test. For the choice of ao, 
we use the same value as recommended by Han and Pan (2010), 
0.1, to prevent reduced power when a too small or too large ao is 
used. 

While the _F-test is based on OLS estimates, in next section we 
apply some penalized regression methods, the Lasso, gflasso and a 
recently developed TLP for either only variable selection (TLP-S) 
or both variable selection and parameter grouping (TLP-SG). In 
short, both the Lasso and TLP-S consider only variable selection, 
while the gflasso and TLP-SG pursue parameter grouping along 
with variable selection to improve power by striking a better bal- 
ance between goodness-of-fit and reduced DF in the joint model 
(Equation 1). 

2.2. PENALIZED REGRESSION BASED TESTS 

2.2. 1. Parameter estimation from penalized regression 

Given a vector of traits Y = (Yi, . . . , Y,,)' and a design matrix for 
k variants X = (X.i, . . . , X.k), the Lasso estimate of P is obtained 
from the penalized least squares function: 



1 

P = argmin-ll Y - Xpf + kV] 



(3) 



where a large k automatically yields some components of f) as 
0, realizing variable selection. While Lasso does effective vari- 
able selection, its estimates are always biased. To overcome the 
issue, Shen et al. (2012) proposed a truncated Lasso(Li)-penalty 
(TLP) Jri\x\) = minO^, 1), which, as r ^ 0+, tends to the Iq- 
loss, /( |x| / 0). The degree of approximation by TLP is controlled 
by a tuning parameter, t . See Figure 1 for a display over the 
different values of t. Then the TLP-estimate ji is obtained from 



argmin-||y - Xpf + ki Y]r{\Pj\), 



(4) 



and we denote (Equation 4) as TLP-S. The most interesting fea- 
ture of the TLP is that only smaller | /ij | 's less than a threshold r are 
penalized, hence realizing variable selection (if some are shrunken 
to 0) while avoiding penalizing larger |/)lj|'s and thus leading to 
their almost unbiased estimates. 

While both the Lasso and TLP-S consider only variable selec- 
tion, an alternative way to reduce model complexity is grouping 
pursuit (Shen and Huang, 2010). To investigate the grouping 
effects on a test's power, we apply two recent penalized group- 
ing methods, gflasso and TLP-SG. The P estimate from gflasso is 
based on the following objective function: 

1 ^ 
P = argmin-ll Y-Xpf + kiY, \f^,\ 



+ k2Y,\fi,-r{j,j')f}j,l 



(5) 



]<}' 
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FIGURE 1 I Truncated Li-penalty (TLP) function JA\?j\) with t = 0.2, 
0.5, and 1 (as solid, dashed and dotted lines, respectively). 
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where the first penalty is used for variable selection and the sec- 
ond is to encourage parameter grouping. r{j,j') is the sign of 
the correlation between two variants X.j and X.ji, which is used 
to approximate the target |/8j| ^ this method is denoted 
gflassOr = cor- On the other hand, if r{j,f) = 1 is used, the penalty 
targets ~ Pf- 

The TLP-SG estimate of f) comes from 

1 ^ 
P = argmin-||Y - Xpf + Ai V/r(l,S;|) 

+ ^2^/r(|l/S;|-|/S/l|), (6) 

where the first penalty is for variable selection while the sec- 
ond shrinks the difference of \Pj\'s if a difference is within the 
upper bound t. The number of the groups of equal parame- 
ter estimates is a decreasing function of X2. Thus, the tuning 
parameters k2, r) are selected to balance between the model 
complexity and model goodness-of-fit, which presumably may 
contribute to enhanced power. As a comparison, in the Sum test 
all parameters (or variants) are forced to belong to the same sin- 
gle group even if the variants' associations with the trait are quite 
different both in effect sizes and directions; the TLP-SG method 
attempts to conduct a more precise grouping over all variants in 
a data-adaptive way. 

To compute fi in Lasso, gflasso, TLP-S and TLP-SG, we used 
the Feature Grouping and Selection Over an Undirected Graph 
(FGSG) package of Yang et al. (2012), which is a C library with 
interface to MATLAB and is quite fast to run. Its computing effi- 
ciency allowed us to estimate separate tuning parameters for each 
permuted dataset to control the type I error as explained in the 
next section. 

2.2.2. Hypothesis testing 

To test the null hypothesis Ho : yS = 0 in Equation (1), we con- 
duct a permutation-based test, in which the p-value is calculated 
by comparing a test statistic T applied to the original dataset to the 
ones Tg''' applied to the B permuted datasets for fa = 1 , . . . , B. We 
use permutation to control the Type I error since the null distri- 
bution of a test statistic based on a penalized regression estimate 
is in general difficult to obtain. The permutation-based testing 
procedure follows: 

Step L With the original data {(Y,-,X,)}, we solve a penalized 

regression problem to obtain /J in Equation (1). 
Step 2. Calculate a test statistic T = T(P). 
Step 3. By repeatedly permuting the observed Y of the original 

(b) 

data, we obtain B sets of permuted data {(7- ,X,)} for 
b = I, . . . , B. For each permuted data set, {{Y^^\ Xi)},we 
repeat the Steps 1 and 2, obtaining the null statistics Tq . 
Step 4. The final p-value is Ylb= 1 ^ < T^o''^) /B. 

We apply each of several test statistics in Step 2. First, across all 
penalized methods, we use a 1 -df _F-statistic ( 1 -df ) to test the asso- 
ciation between Y and Xf), where /J is the penalized estimate of p 



in Step L Specifically, we fit a linear model 

Yi = ao+ (x,p^ a + ei, 

and test Hq : a = O.This 1-dftest uses variable selection and pos- 
sibly parameter grouping result from the corresponding penalized 
method, while allowing testing with only 1 DF. Second, for 
TLP-SG, we also apply the corresponding SSU and SSUw tests, 
where the test statistics Tssu and Tssuw are both based on the 
selected variables from the corresponding penalized estimates. 
Specifically, 

Tssu= U*'U*, 
TssUw=U*'{V*r'U* with V*=Diag{V*), 

where U* is a sub-component vector of the score vector U corre- 
sponding to \Pj\ 7^ 0, and \Pj\ > 0.001 is considered as non-zero. 
Similarly, V* is the corresponding sub-matruc of the covariance 
matrix V. Note that the grouping information is not used. 

2.2.3. Selection of tuning parameters 

To select the suitable tuning parameters, we apply a grid-search 
with Akaike's information criterion (AIC) (Akaike, 1974): 

AIC= -2logL + 2p, 

where log! = (— n log (a^) — n — p — l) /2 is the log-likelihood 
with the penalized estimate plugged-into model (Equation 
1), and CT^ = Yl"= 1 (Y, - Po - X,pf/in - p - 1). The effective 
number of the parameters, p, in AIC is computed as the number 
of non-zero | Pj | 's for Lasso and TLP-S, as the number of non-zero 
unique Pfs for gflassOr= 1, and as the number of non-zero unique 
l/jj i's for gflassOr = cor and TLP-SG, respectively. For k in Lasso, 
the one resulting in the smallest AIC out of 50 equally spaced 
points in [0.001,10] is selected. Similarly, the values of each of ki, 
ki and r in other methods are searched over five equally spaced 
grid points of [0.001, 1], [0.001,0.5], and [0.001,0.5], respec- 
tively. For each permuted dataset (Yf'\ X,) for = 1, . . . , B, we 

also estimate its own {kf\ k^^^ r'''') to properly control the type 
I error. 

3. RESULTS 
3.1. SIMULATIONS 

We consider two simulation schemes. In the first scheme, we gen- 
erate only RVs with a total of 200 replicates and n = 400 in each 
replicate. The permutation size is set as B = 100. For each repli- 
cate, to generate k variants including six causal ones in linkage 
disequilibrium (LD), as in Wang et al. (2007), two latent vectors 
from multivariate normal distribution MVN(0,J?) are simulated, 
where _R has a first order auto regressive (ARl) structure; the asso- 
ciation between any two elements of the latent vector decreases by 
p = 0.8 times as 1 lag increases. Then, the vector is dichotomized 
to yield a haplotype with the minor allele frequency (MAF) 
of each variant randomly chosen between 0.005 and 0.01. The 
genotype data X, = (Xn, . . . , X/t)' for sample i is obtained by 
adding two haplotypes together. Finally, 7, is generated from 
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the randomly located six causal variants with cr^ = 2 in model 
(Equation 1), where the intercept jSq is set as 0.3 throughout the 
simulations. The considered three cases are: 

Case l:fi = ( 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0, .... 0 )' 

6 k-6 

Case 2: = ( 1.2, 1.2, 1.2, -1.2, -1.2, -1.2, 0, . . . , 0)' 

6 Jt-6 



k-6 



In each case, we vary the number of non-causal RVs k-6 from 0 
to 24 so that the total number of RVs, k, ranges from 6 to 30. The 
Type I error is computed from the Y under Hq : p = (0, . . . , 0)'. 

In the second scheme, multiple RVs and two CVs are gener- 
ated to mimic the GAW17 data we use later. The frequency of one 
allele for each CV is randomly distributed between 0.2 and 0.7, 
and CVs may or may not be chosen as a causal variant in each 
replicate. When a CV is randomly selected as a causal variant, 
its effect size fij is scaled down to Pj/lO in the following cases 
to prevent its dominating association with the outcome. The 
considered three cases for mixed RVs and CVs are: 



Case 1: jS = ( 1, 1, 1, 1, 1, 1, 0, . . . , 0 )' 

k-6 




Figure 2 displays the TLP-S and TLP-SG solution paths of \Pj\ 
over a tuning parameter given other(s), where two horizontal 
lines at 1.2 and 0 give the true parameter values for Case 2 set-up 
with only RVs. In contrast to piece-wise linear solution paths of 
the Lasso estimates, the TLP solution paths are like step functions 
as expected from an Lq -penalty (i.e., best subset selection). 

Table 1 presents the simulation results for the RVs only set- 
ups. The Type I error rates seem to be properly controlled under 
the null for all cases, though there are some slightly inflated num- 
bers, possibly due to the relatively small number of replicates 
and/or permutation numbers. Under the alternative hypothe- 
sis, in Case 1 where the causal associations are all in the same 
direction, the Sum or aSum test beats other methods. Within 
the class of penalized regression methods, TLP-SG with the SSU 
or SSUw test statistics is most powerful; in particular, TLP-SG 
with the SSUw statistic performs better than the f-test regard- 
less of the number of non-causal RVs included. There seems to 
be no gain with grouping in TLP-SG as compared to no group- 
ing in TLP-S, and the Idf-test of TLP-SG works better than 
gflassOr= cor unless the number of non-causal RVs is large at 24. 
Overall, penalized regression methods do not significantly out- 
perform the power over the Sum and aSum tests. In Cases 2 
and 3, where the causal effect directions are mixed, the Sum 



test works poorly as expected, while the aSum test has higher 
power. Overall, either the SSU or SSUw test is the winner. In par- 
ticular, the TLP-S- and TLP-SG-based tests do not significantly 
improve over the SSU and SSUw tests, though they may per- 
form better than those based on the Lasso and gflasso. Again, 
a comparison between TLP-S and TLP-SG reveals that param- 
eter grouping does not seem to contribute much to increased 
power. 

The results of the mixed RVs and CVs set-ups are listed in 
Table 2. Note that, as discussed in Basu and Pan (2011), with 
mixed RVs and CVs, the SSU test might not perform well. Overall, 
the SSUw test is the winner. The penalized methods can perform 
well in some situations, but they do not always outperform the 
SSUw test. Among the penalized methods, the proposed TLP-S 
and TLP-SG are competitive against the Lasso and gflasso. 

An advantage of penalized methods over global tests is the 
formers' ability for variable selection, narrowing down possible 
causal variants. We note that causal variant selection is an under- 
studied problem in genetics, which will become more important 
when we transition from association studies to causal inference. 
On the other hand, variable selection via penalized methods or 
any other methods has yet been fully investigated in the current 
context with large «, small k, and more importantly with RVs. 
In Table 3, we investigate their variable selection performance for 
one simulation set-up; the results for other set-ups are similar 
and thus omitted. We show the mean numbers of true positives 
(TP) and false positives (FP), where a \Pj\ > 0.001 is counted as 
a positive (i.e., non-zero). As expected, the OLS estimates (and 
the global tests) cannot conduct variable selection with the mean 
TP and mean FP close to their maximum possible values. Among 
the penalized methods, a method tends to be either more conser- 
vative (fewer FP and fewer TP at the same time) or more liberal 
(higher FP and higher TP). If we look at the ratio of FP over TP, it 
seems that the Lasso and TLP-SG are best with the highest ratio, 
especially for a larger number of non-causal RVs. 

We compare the performance of the parameter estimates in 
Table 4 for one simulation set-up; the results for other set-ups 
are similar and thus omitted. As expected, the OLS estimates are 
almost unbiased, but with the largest mean squared errors (MSEs) 
due to their large variability. The penalized estimates all have 
smaller MSEs and larger biases than the OLS estimates. Among 
the penalized methods, the TLP-S and TLP-SG estimates have 
much smaller biases, but larger variances and thus larger MSEs 
than those of Lasso and gflasso. In particular, for a causal effect 
iPc), Lasso and gflasso shrink it more toward 0, while TLP-S and 
TLP-SG give much less biased estimates. 

3.2. MINI-EXOME SEQUENCE DATA 

We apply the methods to the mini-exome sequence data from 
the GAW17 (Almasy et al, 2011). The data set consists of 3205 
autosomal genes with 24,487 variants on 697 subjects. The geno- 
types are obtained from the sequence alignment files provided 
by the 1000 Genomes Project for the pilot 3 study. The GAW17 
data include 200 replicates of three simulated quantitative traits 
named Ql, Q2, and Q4, where only Ql and Q2 were influenced by 
genetic factors. Here we use Q2, which is determined by 72 vari- 
ants in 13 genes. The true effect sizes of all variants range from 0.2 
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FIGURE 2 I Solution paths of Idyl's in a simulated dataset of Case 2 with 
k = 22 RVs for TLP-S and TLP-SG over the values of a tuning parameter 
given other(s). Tlie true values of \fij\'s at 1.2 and 0 are given by two 



honzontal lines. (A) TLP-S: r = 0.15. (B) TLP-S: M = 0.1. (C) TLP-SG: 

U2. r) = (0.01,0.15). (D) TLP-SG: U, . t) = (0.02,0.15). (E) TLP-SG: {X^,X2) = 

(0.1,0.01). 



to 1.2; all variants are positively associated with the trait Q2 but 
in differential magnitudes. 

In this study, we test on each of all causal genes (PLAT, SREBFl, 
SIRTl, VLDLR, VNN3, PDGFD, BCHE, INSIGl, LPL, RARE, 
VNNl, and VWF) except GCKR, which contains just one SNP 
The number of causal variants (nC) in each gene affecting Q2, 



and some summary statistics of their MAFs and pairwise correla- 
tions (COR) are listed in Table 5. Within each gene, most variants 
are RVs, but a few are CVs with their MAFs larger than 5%. First, 
we test for any association between Q2 and all variants gene by 
gene as shown in Table 6, and then test on each gene without its 
CVs as shown in Table 7. 
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Table 1 | Empirical Type I error and Power at the nominal level a = 0.05 based on 200 replicates for the RVs only set-ups with six causal RVs 
and a varying number of non-causal RVs. 



Model fitting 



Test 



# of non-causal RVs 



# of non-causal RVs 





statistics 


0 


8 




16 


24 


0 


8 


16 


24 










Null 








Case 1 






OLS 


F-test 


0.030 


0.080 




0.040 


0.060 


0.715 


0.480 


0.340 


0.260 


OLS 


Score 


0.030 


0.080 




0.035 


0.055 


0.710 


0.470 


0.320 


0.245 


OLS 


SSU 


0.030 


0.060 




0.045 


0.045 


0.830 


0.660 


0.510 


0.405 


OLS 


SSUw 


0.035 


0.080 




0.055 


0.060 


0.810 


0.625 


0.500 


0.380 


OLS 


UnninP 


0.045 


0.070 




0.050 


0.035 


0.675 


0.445 


0.360 


0.310 


OLS 


Sum 


0.055 


0.075 




0.040 


0.075 


0.915 


0.685 


0.525 


0.460 


OLS 


aSum 


0.035 


0.065 




0.035 


0.060 


0.910 


0.715 


0.575 


0.520 


Lasso 


Idf 


0.055 


0.075 




0.050 


0.080 


0.710 


0.415 


0.325 


0.270 


gflaSSOr= cor 


Idf 


0.035 


0.080 




0.050 


0.090 


0.690 


0.415 


0.240 


0.295 


gflassOr = i 


Idf 


0.035 


0.070 




0.050 


0.075 


0.685 


0.375 


0.225 


0.275 


TLP-S 


Idf 


0.050 


0.085 




0.050 


0.075 


0.720 


0.450 


0.305 


0.255 


TLP-SG 


Idf 


0.055 


0.085 




0.055 


0.070 


0.700 


0.450 


0.290 


0.250 


TLP-SG 


SSU 


0.055 


0.080 




0.040 


0.060 


0.700 


0.520 


0.440 


0.390 


TLP-SG 


SSUw 


0.040 


0.075 




0.045 


0.070 


0.790 


0.500 


0.365 


0.320 










Case 2 








Case 3 






OLS 


F-test 


0.635 


0.515 




0.440 


0.455 


0.745 


0.640 


0.550 


0.490 


OLS 


Score 


0.625 


0.500 




0.425 


0.395 


0.745 


0.635 


0.525 


0.470 


OLS 


SSU 


0.590 


0.530 




0.505 


0.445 


0.710 


0.645 


0.595 


0.555 


OLS 


SSUw 


0.570 


0.505 




0.475 


0.445 


0.715 


0.660 


0.570 


0.525 


OLS 


UminP 


0.450 


0.410 




0.400 


0.310 


0.665 


0.595 


0.425 


0.425 


OLS 


Sum 


0.145 


0.125 




0.145 


0.100 


0.485 


0.310 


0.260 


0.215 


OLS 


aSum 


0.450 


0.430 




0.355 


0.340 


0.665 


0.590 


0.535 


0.500 


Lasso 


Idf 


0.615 


0.465 




0.405 


0.390 


0.765 


0.585 


0.465 


0.435 


gflaSSOr= cor 


Idf 


0.620 


0.530 




0.435 


0.480 


0.765 


0.600 


0.480 


0.520 


gflassOr = i 


Idf 


0.615 


0.535 




0.435 


0.425 


0.750 


0.585 


0.475 


0.495 


TLP-S 


Idf 


0.615 


0.505 




0.455 


0.425 


0.760 


0.630 


0.530 


0.475 


TLP-SG 


Idf 


0.615 


0.485 




0.445 


0.415 


0.755 


0.605 


0.450 


0.450 


TLP-SG 


SSU 


0.565 


0.470 




0.460 


0.445 


0.705 


0.605 


0.510 


0.525 


TLP-SG 


SSUw 


0.585 


0.505 




0.460 


0.415 


0.745 


0.585 


0.485 


0.475 



Maximum power in bold. 



In Table 6, when both RVs and CVs within a gene are included, 
the identity of the most powerful test differs across the genes: the 
F-test is the winner for the genes VLDLR, VNN3, PDGFD, and 
LPL; however, for the genes VLDLR, BCHE, VNNl, and VWF, 
the SSU or SSUw test is the best. The two gflasso-based tests work 
quite similarly over all genes. The TLP based tests perform best 
for the genes SREBFl, RARAB, VNNl, and INSIGl. After remov- 
ing a few CVs in each gene (Table 7), the SSU test recovers good 
power for the genes PDGFD, BCHE and LPL. The Sum test is the 
winner for gene BCHE, while the F-test based on the OLS esti- 
mates perform best for genes VNN3, SREBFl, and PDGFD. For 
gene VNNl, the TLP-SG with the SSU statistic has the highest 
power. 

A potential advantage of penalized regression is variable 
selection, which is missing from existing global tests. Table 8 
shows the results of causal variant selection by the penalized 
methods. Overall, each penalized method could eliminate some 



non-associated variants at the cost of omitting some causal ones. 
In general, in agreement with simulated data, the Lasso and TLP- 
SG seem to select fewest variants, including both TPs and FP, 
while TLP-S and gflasso give higher numbers of both TPs and FPs. 

4. DISCUSSION 

In this study we have conducted hypothesis testing to detect the 
association between a quantitative trait and multiple RVs based 
on some new penalized regression methods. In addition to the 
traditional use of penalized regression for variant selection, we 
have also considered several state-of-the-art grouping pursuit 
methods that smooth the effect sizes of the variants, either f}, or 
1/3,1, in a data- adaptive way, which can be considered as a general- 
ization of the Sum and other genotype pooling/collapsing-based 
burden tests. In particular, our proposed TLP-SG overcomes sev- 
eral limitations of the Sum and other burden tests. First, by vari- 
able selection, the result of TLP-SG is presumably less influenced 
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Table 2 | Empirical Type I error and Power at the nominal level a = 0.05 based on 200 replicates for the RVs + CVs set-ups with six causal 
variants and a varying number of non-causal ones. 



Model fitting 



Test 



# of non-causal variants 



# of non-causal variants 





siaiistics 


0 


8 




16 


24 


0 


8 


16 


24 










Null 








CasG 1 






OLS 


F-test 


0.025 


0.045 




0.065 


0.050 


0.760 


0.520 


0.355 


0.385 


OLS 


Score 


0.020 


0.045 




0.065 


0.035 


0.760 


0.515 


0.345 


0.350 


OLS 


SSU 


0.060 


0.050 




0.090 


0.030 


0.490 


0.210 


0.125 


0.110 


OLS 


SSUw 


0.040 


0.035 




0.060 


0.035 


0.845 


0.695 


0.510 


0.510 


OLS 


UnninP 


0.030 


0.055 




0.060 


0.025 


0.715 


0.540 


0.380 


0.410 


OLS 


Sunn 


0.055 


0.060 




0.075 


0.045 


0.695 


0.450 


0.315 


0.315 


OLS 


aSum 


0.050 


0.060 




0.065 


0.045 


0.665 


0.435 


0.325 


0.340 


Lasso 


Idf 


0.030 


0.045 




0.060 


0.045 


0.750 


0.515 


0.360 


0.375 


gflaSSOr= cor 


Idf 


0.030 


0.030 




0.070 


0.015 


0.760 


0.450 


0.275 


0.415 


gflassOr=i 


Idf 


0.030 


0.030 




0.070 


0.015 


0.765 


0.455 


0.290 


0.385 


TLP-S 


Idf 


0.035 


0.050 




0.050 


0.030 


0.750 


0.540 


0.360 


0.370 


TLP-SG 


Idf 


0.035 


0.035 




0.065 


0.045 


0.750 


0.515 


0.335 


0.315 


TLP-SG 


SSU 


0.075 


0.060 




0.055 


0.065 


0.495 


0.230 


0.140 


0.105 


TLP-SG 


SSUw 


0.030 


0.055 




0.055 


0.045 


0.845 


0.675 


0.435 


0.375 










Case 2 








Case 3 






OLS 


F-test 


0.800 


0.765 




0.720 


0.650 


0.655 


0.585 


0.415 


0.375 


OLS 


Score 


0.800 


0.755 




0.710 


0.630 


0.645 


0.580 


0.400 


0.360 


OLS 


SSU 


0.275 


0.175 




0.155 


0.160 


0.200 


0.140 


0.110 


0.105 


OLS 


SSUw 


0.715 


0.705 




0.715 


0.665 


0.640 


0.615 


0.485 


0.415 


OLS 


UminP 


0.640 


0.615 




0.550 


0.505 


0.530 


0.510 


0.370 


0.345 


OLS 


Sum 


0.190 


0.120 




0.125 


0.100 


0.195 


0.150 


0.090 


0.110 


OLS 


aSum 


0.345 


0.275 




0.270 


0.315 


0.290 


0.225 


0.195 


0.210 


Lasso 


Idf 


0.805 


0.695 




0.640 


0.585 


0.580 


0.555 


0.415 


0.360 


gflaSSOr= cor 


Idf 


0.810 


0.725 




0.625 


0.655 


0.595 


0.570 


0.420 


0.415 


gflassOr=i 


Idf 


0.805 


0.725 




0.620 


0.655 


0.590 


0.570 


0.435 


0.395 


TLP-S 


Idf 


0.790 


0.730 




0.680 


0.615 


0.600 


0.570 


0.395 


0.390 


TLP-SG 


Idf 


0.795 


0.730 




0.620 


0.600 


0.600 


0.555 


0.400 


0.310 


TLP-SG 


SSU 


0.310 


0.185 




0.165 


0.210 


0.205 


0.120 


0.125 


0.120 


TLP-SG 


SSUw 


0.750 


0.720 




0.650 


0.550 


0.675 


0.560 


0.460 


0.390 



Maximum power in bold. 

Table 3 | Mean numbers of TP(sd)/FP(sd) of the methods in Case 2 with both RVs and CVs. 



Method 



# of non-causal variants 



16 



24 



OLS 


5.9(0.2)/. 


5.9(0.3)/79(0.4) 


5.9(0.3)/1 5.7(0.5) 


6.0(0.2)/23.5(0.7) 


Lasso 


4.4(1.7)/. 


3.7(1.6)/2.9(2.1) 


3.5(1.7)/4.7(3.3) 


3.2(1.7)/5. 9(4.2) 


gflaSSOr= cor 


5.4(1.0)/. 


4.8(1. 5)/5.1(2.4) 


4.1(2.0)/8.0(5.0) 


3.5(2.2)/9.3(75) 


gflassOr=i 


5.2(1.1)/. 


4.5(1.6)/4.8(2.4) 


4.3(1.9)/8.9(5.4) 


4.1(2. 1)/12.3(8.7) 


TLP-S 


5.4(0.9)/. 


4.7(1.1)/4.3(1.5) 


4.4(1. 1)/75(2.1) 


4.3(1.1 )/11.0(2.9) 


TLP-SG 


4.7(2.0)/. 


4.3(1.8)/4.2(3.2) 


3.6(1.6)/5.3(4.5) 


3.5(1.5)/6.3(5.0) 



When k= 6, FP is 0 and denoted as ". " after "/'.' 

by the presence of many non-associated variants to be tested. 
Second, rather than pooHng all the variants into a single group 
or two groups, TLP-SG automatically determines the number 
of groups to be formed based on the given data. Furthermore, 



since TLP-SG shrinks the effects sizes |jS,|, not /i,-, toward each 
other, it is robust to varying association directions of the causal 
variants. However, based on our studies on both simulated and 
real sequence data, we found that TLP-SG and other penalized 
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Table 4 | Means, sd's and MSEs of some causal {fics) and non-causal iPncs) variants' regression coefficient estimates when ft = 30 In Case 2 
with both RVs and CVs. 

Methods ^cs = 1-5 ^cs = 1-5 fines = 0 



Mean sd MSE Mean sd MSE Mean sd MSB 



OLS 1.59 1.26 3.16 1.54 1.53 4.69 -0.04 1.37 3.77 

Lasso 0.93 0.87 1.82 0.84 0.81 1.74 0.01 0.47 0.45 

gflassOr = cor 0.88 0.93 2.11 0.80 0.86 1.96 0.01 0.57 0.64 

gflassOr=i 0.83 0.92 2.15 0.74 0.86 2.05 0.02 0.55 0.61 

TLP-S 1.35 1.14 2.60 1.25 1.15 2.70 -0.03 0.85 1.45 

TLP-SG 1.28 1.16 2.72 1.29 1.15 2.70 0.01 0.85 1.44 



Table 5 | MAFs (%) and pair-wise correlations (COR) In the values of (min, mean, max) for the 12 genes influencing the quantitative trait Q2 In 
the GAW17 data. 



Gene 




All 


Causal 


Non-causal 


PLAT 




(0.072,2.098,45.12) 


(0.072,0.206,0.574) 


(0.072,2.855,45.12) 


SREBF1 




(0.072,0.699,7747) 


(0.072,0.222,0.43) 


(0.072,1.04,7747) 


SIRT1 




(0.072,0.858,16.71) 


(0.072,0.12,0.215) 


(0.072,1.332,16.71) 


VLDLR 




(0.072,1.0479.469) 


(0.072,0.126,0.287) 


(0.072,1.435,9.469) 


VNN3 




(0.072,4.429,40.53) 


(0.072,2.06,9.828) 


(0.072,6.501,40.53) 


PDGFD 


MAF 


(0.072,4.115,31.56) 


(0.072,0.2870.861) 


(0.072,6.303,31.56) 


BCHE 


(0.072,0.625,14.56) 


(0.072,0.105,0.287) 


(0.072,1.076,14.56) 


INSIG1 




(0.072,0.775,3.587) 


(0.072,0.072,0.072) 


(0.072,1.829,3.587) 


LPL 




(0.072,1.854,14.490) 


(0.072,0.598,1.578) 


(0.072,2.076,14.490) 


RARE 




(0.072,0.352,1.363) 


(0.072,0.2870.502) 


(0.072,0.3671.363) 


VNN1 




(0.072,2.675,17070) 


(0.574,8.824,17070) 


(0.072,0.215,0.359) 


VWF 




(0.072,0.944,2.080) 


(0.072,0.323,0.574) 


(0.359,1.255,2.080) 


PLAT 




(-0.143,0.002,0.753) 


(-0.008,-0.003,-0.001) 


(-0.143,0.0070.753) 


SREBF1 




(-0.038,0.0070.635) 


(-0.009,-0.004,-0.001) 


(-0.038,0.024,0.635) 


SIRT1 




(-0.044,0.004,0.707) 


(-0.004,0.0070.33) 


(-0.044,0.002,0.499) 


VLDLR 




(-0.135,-0.001,0.331) 


(-0.003,-0.002,-0.001) 


(-0.135,0.001,0.331) 


VNN3 




(-0.422,-0.002,0.59) 


(-0.104,-0.01,0.072) 


(-0.422,-0.001,0.341) 


PDGFD 


COR 


(-0.156,-0.0070.276) 


(-0.007-0.004,-0.001) 


(-0.156,-0.0070.276) 


BCHE 


(-0.044,0.001,0.499) 


(-0.005,0.004,0.499) 


(-0.044,-0.002,0.075) 


INSIG1 




(-0.010,0.009,0.128) 


(-0.001,-0.001,-0.001) 


(0.128,0.128,0.128) 


LPL 




(-0.138,-0.002,0.215) 


(-0.010,-0.006,-0.002) 


(-0.138,-0.002,0.215) 


RARB 




(-0.025,-0.003,0.073) 


(-0.004,-0.004,-0.004) 


(-0.025,-0.005,-0.001) 


VNN1 




(-0.046,0.038,0.945) 


(0.055,0.055,0.055) 


(-0.005,0.091,0.945) 


VWF 




(0.113,0.316,0.564) 


(0.265,0.265,0.265) 


(0.1270.246,0.466) 



methods sometimes might be more powerful than some existing 
global tests, though they do not always outperform the SSU or 
SSUw test. The discovery of no uniform gain of penalized meth- 
ods over existing global tests is interesting and even surprising, 
and can be due to non-optimal implementation of the penal- 
ized methods in several aspects. First, the selection of the tuning 
parameters based on the model selection criterion AIC may not 
be optimal. As an example, in a simulated dataset, when we set 
the tuning parameters to properly group the variants, the esti- 
mates were quite close to the true values, but the corresponding 
AIC was less desirable, leading to choosing other low perform- 
ing tuning parameters. Importantly, there is no theory yet to 
justify the applicability of AIC for the gflasso- and TLP-based 



methods; in particular, it is unclear how to count the effective 
number of parameters in the AIC. Alternatively, one may want 
to try a more popular model selection method, multi-fold cross- 
validation. However, for RVs as considered here, if we divide 
the data into multiple folds, the training data may contain sev- 
eral monomorphic variants, causing non-identifiability of their 
corresponding effect sizes. Second, due to the repeated model- 
fitting with many permuted datasets, to save computing time, we 
only searched relatively few grid points for the tuning parameters, 
which might not have covered some suitable tuning parameter 
values. These are all issues to be addressed in the future. 

Another non-convex penalty is SCAD (Fan and Li, 2001), 
which as TLP aims to reduce the biases of large coefficient 
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Table 6 | Empirical power based on the GAW17 data from 200 
replicates of Q2, k, and nC denote the numbers of the total and 
causal variants in a gene. 



Table 7 | Empirical power based on the GAW17 data without CVs 
from 200 replicates of Q2, k, and nC denote the numbers of the total 
and causal RVs in a gene. 



Model fitting Test 
stats 



Gene(k, nO 



IVIodel fitting Test 
stats 



Gene(fc, nO 







PLAT 


SREBF1 


SIRT1 


VLDLR 


VNN3 


PDGFD 






PLAT 


SREBF1 


SIRT1 


VLDLR 


VNN3 






(28,8) 


(24,10) 


(23,9) 


(27,8) 


(15,7) 


(11,4) 






(26,8) 


(23,10) 


(22,9) 


(24,8) 


(12,6) 


OLS 


F-test 


0.070 


0.275 


0.360 


0.155 


0.640 


0.340 


OLS 


F-test 


0.070 


0.295 


0.305 


0.135 


0.435 


OLS 


Score 


0.060 


0.260 


0.355 


0.155 


0.640 


0.335 


OLS 


Score 


0.065 


0.295 


0.305 


0.135 


0.430 


OLS 


SSU 


0.040 


0.025 


0.355 


0.055 


0.185 


0.060 


OLS 


SSU 


0.040 


0.095 


0.430 


0.075 


0.225 


OLS 


SSUw 


0.035 


0.245 


0.445 


0.155 


0.555 


0.320 


OLS 


SSUw 


0.055 


0.270 


0.375 


0.130 


0.390 


OLS 


UminP 


0.065 


0.185 


0.420 


0.120 


0.555 


0.310 


OLS 


UminP 


0.060 


0.190 


0.390 


0.125 


0.410 


OLS 


Sum 


0.040 


0.075 


0.560 


0.065 


0.410 


0.055 


OLS 


Sum 


0.055 


0.260 


0.350 


0.105 


0.265 


OLS 


aSum 


0.070 


0.130 


0.565 


0.095 


0.415 


0.075 


OLS 


aSum 


0.085 


0.295 


0.380 


0.155 


0.270 


Lasso 


Idf 


0.100 


0.270 


0.285 


0.110 


0.595 


0.300 


Lasso 


Idf 


0.105 


0.255 


0.265 


0.140 


0.275 


gflaSSOr= cor 


Idf 


0.085 


0.195 


0.225 


0.135 


0.555 


0.290 


gflass0r= cor 


Idf 


0.105 


0.195 


0.220 


0.110 


0.255 


gflassOr=i 


Idf 


0.085 


0.215 


0.225 


0.135 


0.570 


0.300 


gflassOr=i 


Idf 


0.100 


0.225 


0.210 


0.110 


0.280 


TLP-S 


Idf 


0.065 


0.290 


0.330 


0.130 


0.630 


0.325 


TLP-S 


Idf 


0.065 


0.280 


0.295 


0.175 


0.355 


TLP-SG 


Idf 


0.025 


0.090 


0.165 
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estimates resulting from the Lasso or Li penalty. Although SCAD 
can be equally applied and compared here, we chose the TLP 
as a representative of non-convex penalties for its good proper- 
ties: as shown by Shen et al. (2012), Lq regularization is opti- 
mal in variable selection, and its computational surrogate, TLP, 
shares the same property for sufficiently small tau; furthermore, 
the variable selection consistency of TLP regularization also led 
to enhanced parameter estimation and prediction in numeri- 
cal studies with finite sample sizes. Nevertheless, we note that, 
penalized regression methods have been intensively studied for 
high-dimensional data, but not for the type of data consid- 
ered here, which are low dimensional but with RVs as sparse 
predictors. 

In summary, the established benefit of penalized regression for 
variable selection and risk prediction for high-dimensional data 



Maximum power in bold. 

(Kooperberg et al, 2010) did not seem to directly translate into 
substantial power gains in genetic association testing. In addition 
to the current work, there exist three recent reports (Croiseau and 
Cordell, 2009; Martinez et al, 2010; Basu et al, 2011) questioning 
the effectiveness of the Lasso penalized regression in hypothe- 
sis testing, while Basu et al. (2011) showed that several variable 
selection approaches did not outperform some global tests (e.g., 
the SSU or SSUw test) for association analysis of CVs. Due to 
the limitations mentioned above, we cannot conclude here that 
any penalized regression method would not outperform exiting 
global association tests; rather, further investigation on enhanced 
tuning parameter selection and better choice of the test statistic is 
warranted. Finally, we note that the capability of variable selection 
by penalized regression can be useful, e.g., in narrowing down 
causal variants. 
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Table 8 | Mean numbers of TP(sd)/FP(sd) in the GAW17 data, where q^ and qO denote the numbers of the causal and non-causal variants in 
each gene. 
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